METHOD AND DEVICE FOR RECORDING OF INFORMATION 

TECHNICAL FIELD OF THE INVENTION 

The present invention relates to a method and a 
device for recording of text by imaging the text on an 
optical sensor in a handheld device, said sensor being 
intended for digital recording of images.

BACKGROUND OF THE INVENTION 

It is sometimes desirable to extract parts of the text or
image information in a document so that they can later be
edited using appropriate software in a computer. A known
method of inputting text and image information into a
computer is to use a stationary or portable scanner.

A stationary scanner is suitable for entering entire
pages with text and image information, the scanner being
automatically passed across the page at a constant speed.
This type of scanner is not suited for inputting selected
parts of information on a page.

A portable, handheld scanner may be used any time
interesting information is to be scanned, but normally
has a limited field of view.

US-5,301,243 discloses a hand-held scanner for reading
characters from a string of characters on a substrate.
The scanner is moved in contact with the substrate along
the character line and has an optical system which images
a small part of the substrate. The optical system comprises
a CCD type line sensor provided with a plurality of
light-sensitive elements arranged in a line. When the
scanner is passed across the characters on the substrate,
a succession of vertical slices of the characters and of
the spaces between them is recorded. The slices are stored
in the scanner as a digital bit-map image. OCR software
(OCR = Optical Character Recognition) is then used to
identify the characters and store them in character-coded
form, such as ASCII code. The character recognition can be
made either in the scanner or in an external computer to
which the bit-map image is sent.

Another type of hand-held scanner for inputting text is
disclosed in US-4,949,391. This scanner has a
two-dimensional sensor which records images of the
underlying surface as the scanner is being moved across the
same. The scanner is restricted to movements in a direction
which is determined by a wheel in contact with the surface.
Before the recorded images are assembled into a composite
image, redundant information is removed from the images.
The composite image can be analyzed in a computer for
identification of characters.

A drawback with the handheld scanners described above is
that their field of view is relatively small. In order to
record a large amount of information, such as passages
consisting of several lines, a user must therefore move the
scanner back and forth across the surface repeatedly.
Moreover, the movement has to follow a predetermined path,
such as along the lines of text.

Publication WO 99/57678 discloses a device for
recording information from a substrate. The device may
operate in two modes: a scanner mode, in which lines of
text are scanned, and a photograph mode, in which separate
pictures are taken of a document or an object.

Publication WO 98/20446 discloses a scanner pen,
which is adapted to be moved over a line of text for
scanning the text. As the pen moves over the text image,
several pictures are taken of the text. The pictures are
processed by a computer and assembled or stitched
together to form a composite image of the entire line
of text, which cannot be captured in a single picture. The
scanner pen can only scan a single line of text at a time.

Thus, there is a need for a handheld scanner pen of
the above-mentioned type which is adapted to scan several
lines of text simultaneously as well as smaller pictures.

SUMMARY OF THE INVENTION 

An object of the present invention is to provide
a scanner pen which enables fast recording of text in real
time.

Another object of the invention is to provide a 
scanner pen which may be used at a distance from the text 
and may scan several lines of text in a single stroke as 
well as discrete pictures. 

These objects are obtained by a method and a device
for recording information by imaging on a light-sensitive
sensor for obtaining at least two images of the
information having partially overlapping contents. The
method comprises converting the information in each of
the images to a coded representation, comparing the coded
representation of said images for determining an overlap
position, and assembling the images to form a composite
image. The coded representation may be a character code,
such as ASCII. Alternatively, the coded representation
may comprise a division of the information inside
borders, such as rectangles, each comprising portions of
the information, such as words included in said
information. Thereafter, the composite image may be
converted to a character code format, such as ASCII code.
Alternatively, each image may be separately converted into
character code format, such as ASCII, before assembling.

The method may further comprise determining 
structures in each of said images, such as direction of 
lines or text line directions in each image. This may be 
accomplished by means of a Hough transformation of each 
image. This information may be used for adjusting the
rotational position and/or perspective of each image
depending on the direction of lines. The information may
also be used for the division of the image into
rectangles.

A concept of the present invention is to record a
plurality of images of a text, said images overlapping
each other, each image comprising several lines of text.
Subsequently, OCR (Optical Character Recognition) is
carried out on the recorded images, resulting in
sets of characters. The sets of characters may contain a
number of characters indicating "end of line" if the text
comprises several lines. Then the sets of characters are
assembled using the characters in the sets of characters.
An advantage of this mode of operation is that relatively
large images can be recorded at a time, without the
assembling of the images being cumbersome, since the
images are converted into character codes before being
assembled. The effective resolution is low in a set of
characters compared to a bit-map image, thus saving
computing power. The effective resolution in the set of
characters is a single character. Thus, assembling in two
dimensions may be possible with the present invention in
a handheld device.

The lines of text in two subsequent images do not
necessarily coincide in the vertical direction. The first
line in a first recorded image may correspond to the
second line in a second recorded image. However, the
assembling will adjust the vertical position so that
correct assembling is obtained.

By a set of characters is meant a plurality of
characters, from which the relative positions of the
characters can be determined. The set of characters may
be a string of characters comprising characters for blank
and end of line.

A device for recording a text images the text on a
light-sensitive sensor with a two-dimensional sensor
surface, which sensor is intended for digital recording
of images of the text, said images having partly
overlapping contents. The device is characterized in that
it is adapted to convert at least two of the images
recorded by the sensor into a set of characters each,
comprising a plurality of characters, by means of
character recognition. The device is adapted to
subsequently assemble the sets of characters with the aid
of the characters in the sets of characters.

By carrying out character recognition before
assembling the images, the operation of assembling large
images at pixel level can be omitted. Moreover, there is
less risk that a character would not be recognized owing
to poor assembling, which may be the case when digital
images are assembled at pixel level and character
recognition is then carried out in an area that is
overlapped by both of the assembled images. In that case,
the character may be distorted if the assembling is not
carried out properly, with the result that the character is
not recognized in the character recognition process.
According to this invention, the character recognition
process takes advantage of the original quality of the
image for character recognition. By first converting the
images into sets of characters, the actual assembling of
the sets of characters may be quick since the number of
characters is considerably smaller than the number of
pixels in the recorded images.

Moreover, by OCR interpretation of each image before
the assembly thereof, a plurality of OCR interpretations
of the same character will be obtained, one for each
image where the character is included, and the
interpretation which gives the highest recognition
probability can be selected.
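
By way of a non-limiting illustration, the selection of the
most probable interpretation may be sketched as follows. The
sketch assumes that the character recognition returns a
probability for each character and that the relative
displacement of the two sets of characters is already known;
the function name merge_by_confidence and the parameter shift
are hypothetical and do not appear in the description.

```python
def merge_by_confidence(first, second, shift):
    """Merge two OCR results given as lists of (char, probability).

    Assumption: second[j] depicts the same position as first[j + shift],
    with shift >= 0. Where both results cover a position, the
    interpretation with the higher probability is kept.
    """
    merged = list(first[:shift])
    end = max(len(first), shift + len(second))
    for i in range(shift, end):
        candidates = []
        if i < len(first):
            candidates.append(first[i])
        if 0 <= i - shift < len(second):
            candidates.append(second[i - shift])
        merged.append(max(candidates, key=lambda c: c[1]))
    return "".join(ch for ch, _ in merged)


first = [("c", 0.9), ("a", 0.8), ("t", 0.4), ("5", 0.3)]
second = [("t", 0.9), ("s", 0.95)]
print(merge_by_confidence(first, second, shift=2))
```

In this example the doubtful "5" at the edge of the first
image is replaced by the "s" recognized with higher
probability in the second image.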

Alternatively, it is possible to assemble the sets
of characters using words in the set of characters. Thus,
entire words in one of the sets of characters that are to
be assembled are compared with words in the other of the
sets of characters that are to be assembled. When
assembling words, it may be required to compare each
individual character.

By word is meant a plurality of characters which
also includes special characters. The special characters
are, for example, blank, full stop, comma or end of line.

The method may comprise finding the lines of text in
the recorded images, finding the start and end of words
along the lines of text, and selecting which of the recorded
images are to be converted into sets of characters with
the aid of the identified start and end of the words in
the recorded images, so that only images with the
necessary information are converted into sets of
characters or that images with duplicate information may
be discarded. By identifying start and end of the words
along the lines of text, it will be possible to make a
rough assembling of the images without first making
optical character recognition. When the start and end of
the words have been identified, the recorded images are
corrected as regards rotation and perspective. By making
a rough assembling of the images, it will be possible to
find out how the images overlap each other. The images
which contain only information that is available
completely in other images then need not be converted
into sets of characters.

Start and end of words along the lines of text may
be identified by searching, in each pixel along a line
through the lines of text, for the number of dark pixels
a predetermined number of pixels up and down from the
line of text. The end of a word is defined as a position
where there are no dark pixels within a predetermined
number of pixels above and below the line of text, i.e.
there are blanks. To manage italics one may alternatively
search along an oblique line. It is, of course, possible to
search for white pixels, instead of dark pixels, if the
text should be brighter than the background, i.e. inverted.

The images may be converted into binary images, i.e. 
images containing merely black and white, since this 
facilitates the continued image processing. 

Moreover, the method may comprise finding the lines
of text in the recorded images using the Hough
transformation of the recorded images. The Hough
transformation can briefly be described as follows. There
is an infinite number of straight lines extending through
a point in an XY plane. The equation of the straight line
for each of these lines can be expressed with two
parameters. If the parameters of the individual lines are
plotted in a diagram, a curve is obtained which
corresponds to the Hough transform of the point. In this
way, it is possible to plot curves for any of the points
in the XY plane. The Hough transforms of two different
points will intersect in a point, which corresponds to
the equation of the straight line extending through the
two points. If the Hough transforms for all the dark
pixels in a recorded image are plotted, a large number of
intersections between the different Hough transforms will
be obtained. However, there is a maximum number of
intersections for lines following the lines of text.

The device may comprise a memory adapted to store 
the recorded images, which are to be converted into sets 
of characters, in the memory, and to convert the stored 
images into sets of characters after completion of the 
recording of the images. By only storing the recorded 
images which are to be converted into sets of characters, 
the memory space which is required for storing recorded 
images is minimized. By converting the stored images into 
sets of characters after the recording of images has been 
completed, it is not necessary to place high demands on 
the speed of the optical character recognition, which 
would be the case if it were to be carried out while 
images are being recorded. 

The device is advantageously designed in such a manner
that a user can hold it by hand and at a distance
from a substrate to record text on the substrate.

The device may be adapted to correct the images for 
rotation before they are converted into sets of 
characters. In the case where the lines of text in the 
recorded images have been identified, the correction for 
rotation can be carried out in a relatively simple way. 
However, some optical character recognition programs can
also process rotated images, in which case the correction
for rotation is not required.

Correspondingly, the device may correct the images
for perspective before they are converted into sets of
characters since the optical character recognition may be
facilitated if all letters have the same size in the
images.

The device may be designed as a reading head which
is connected to a calculating unit in which the recorded
images are processed.

The device may be adapted to assemble the sets of
characters by comparing the sets of characters in pairs,
the sets of characters being compared in a number of
relative positions displaced relative to each other.
Thus, a first character in a first set of characters is
compared with characters in the second set of characters
until correspondence is achieved or until the first
character in the first set of characters has been
compared with all characters in the second set of
characters. Subsequently, the second character in the
first set of characters is compared in the same way
with the characters in the second set of characters. By
making the comparison for a large number of different
relative positions, a plurality of total numbers of
points can be obtained, the total number of points
reflecting the correspondence between the two sets of
characters for the specific position. In this way, an
optimum relative position can be obtained.
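
As a minimal sketch of this comparison, assuming that each
set of characters is a single string and that one point is
awarded for each corresponding character and one point
deducted for each differing character (a scoring rule chosen
here only for illustration), the optimum relative position
may be found as follows. The names best_overlap, assemble and
min_overlap are hypothetical.

```python
def best_overlap(first, second, min_overlap=3):
    """Try every relative position and return (shift, points).

    second[j] is compared with first[j + shift]; the position
    with the highest total number of points wins.
    """
    best_shift, best_points = None, 0
    for shift in range(-len(second) + 1, len(first)):
        lo = max(0, shift)
        hi = min(len(first), shift + len(second))
        if hi - lo < min_overlap:
            continue
        points = sum(1 if first[i] == second[i - shift] else -1
                     for i in range(lo, hi))
        if points > best_points:
            best_shift, best_points = shift, points
    return best_shift, best_points


def assemble(first, second, min_overlap=3):
    """Join two strings at their optimum relative position."""
    shift, _ = best_overlap(first, second, min_overlap)
    if shift is None:          # no position with a positive score
        return None
    if shift < 0:              # second starts before first: swap roles
        first, second, shift = second, first, -shift
    return first + second[len(first) - shift:]


print(assemble("a person skilled in", "skilled in the art"))
```

Because every relative position is scored, a word that occurs
more than once in the text does not by itself fix the
position; the position with the highest total is used.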

The device may be adapted to store the recorded
images that are to be converted into sets of characters
along with a serial number indicating in which order the
images have been recorded, and to assemble the sets of
characters with the aid of the serial number for the
images corresponding to the sets of characters.
The serial numbers are useful especially in the case where
all images are first recorded and the character recognition
and the assembling are not begun until then, since a large
number of images are then to be assembled.

According to a second aspect of the present inven- 
tion, a method is provided for recording of text on a 
substrate, comprising the step of imaging and digitally 
recording images of the text, the images having partly 
overlapping contents. The method is characterized in that 
it comprises the steps of converting at least two of the 
recorded images into a set of characters, each with a 
plurality of characters, by means of optical character 
recognition, and putting together the sets of characters 
with the aid of the characters in the sets of characters. 

The area recorded by the sensor may be arranged so 
that a plurality of lines of text are imaged in a 
recorded image.

According to a third aspect of the present invention,
a computer-readable storage medium is provided, in which a
computer program is stored which is adapted to be used
for conversion of digital images, which are recorded by
an image sensor, into text. The storage medium is
characterized in that the computer program comprises
instructions for making the computer receive digital images
as input signals, convert the digital images into sets of
characters, with a plurality of characters, by means of
character recognition, and put together the sets of
characters with the aid of the characters in the sets of
characters.

Further objects, features and advantages of the 
invention will appear from the following detailed 
description of embodiments of the invention with 
reference to the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a schematic view of a device according to 
a first embodiment of the present invention. 

Fig. 2 is a schematic block diagram of parts of the 
embodiment of Fig. 1.



Fig. 3 is a schematic view of images of text on a 
sheet of paper, which are recorded according to the 
invention. 

Figs. 4a and 4b are diagrams, illustrating the 
principle of the Hough transformation. 

Fig. 5 is a diagram, which shows maximum points for 
the Hough transform of two different images. 

Fig. 6 is a diagram and a histogram for illustrating 
the detection of start and end points of words. 

Figs. 7a and 7b are diagrams for illustrating the 
division of the text images into words. 

Fig. 8 is a diagram of two images converted to 
characters for assembling. 

Fig. 9 is a diagram for illustrating lines of text 
in a recorded image. 

Fig. 10 is a flow chart of the operation of a com- 
puter program according to the invention. 

Figs. 11a, 11b and 11c are schematic representations
of a text, the division thereof into rectangles and its
display on a small display.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION 

Fig. 1 discloses a scanner pen, which comprises a casing 1
having approximately the same shape as a conventional
highlighter. In one short side of the casing there is an
opening 2, which is intended to be directed at an area on
a substrate which a user wants to image. The substrate,
or information carrier, can be a sheet of paper.

The casing 1 essentially contains an optics part 3, 
an electronic circuitry part 4 and a power supply part 5. 

The optics part 3 comprises a lens system 7, light- 
emitting diodes 6, and an optical sensor 8 constituting 
an interface with the electronic circuitry part 4. The
light-emitting diodes 6 may be used to increase the 
illumination. 

The light-sensitive optical sensor 8 may be a
two-dimensional CMOS unit or CCD unit (CCD = Charge Coupled
Device) with a built-in AD converter. Such sensors are
commercially available. The sensor 8 may be mounted on a
printed circuit board 11.

The power supply to the device is obtained from a
battery 12 which is mounted in a separate compartment 13
in the casing.

Fig. 2 is a block diagram of the electronic circuitry
part 4, which comprises a processor 20 which, via a bus
21, is connected to a read-only memory ROM 22, in which
the program of the processor is stored, to a write/read
memory RAM 23, which is the work memory of the processor
and in which the images from the sensor as well as
characters that are interpreted from the recorded images
are stored, to a control logic unit 24 and to the sensor
8.

The control logic unit 24 is connected to a number
of peripherals, such as a display 25 mounted in the
casing, an IR transceiver or short-range radio link 26
for transferring information to/from an external
computer, buttons 27 by means of which the user can
control the device, and an operation indicating device 28
consisting of a second set of light-emitting diodes which
may indicate whether recording occurs or not and other
operating conditions. The control logic unit 24 generates
control signals to the memories, the sensor and the
peripherals. The control logic unit also manages the
generation and the prioritization of interrupts to the
processor. The buttons 27, the transceiver 26, the
display 25 and the light-emitting diodes 6 are controlled
by the processor by writing and reading in the records of
the control logic unit. The buttons 27 generate
interrupts to the processor 20 when activated.

The function of the device will now be described. A
sheet of paper 9 is provided with a plurality of lines of
printed text 10 as shown in Fig. 3. When a user activates
the scanner pen by means of the buttons 27 and passes it
across the sheet of paper with the opening 2 directed
towards the sheet of paper, three images 14, 15, 16 are
recorded. Each of the images 14, 15, 16 is subjected to OCR
processing and the text of the images is converted into
sets of characters as illustrated in Fig. 8. Subsequently
the sets of characters are assembled or stitched so as to
form a complete text. As shown in Fig. 3, the images 14,
15, 16 may be rotated in relation to each other. Thus,
a first image 14 is turned or rotated in relation to a
second image 15, which in turn is rotated in relation to
a third image 16.

In order to optimize the optical character recognition
in the recorded images 14, 15, 16 it is advantageous
to know the orientation of the lines of text in the
image. Therefore, the orientation of the lines of text is
detected before the character recognition is carried out.

The detection of the orientation of the lines of
text may be carried out using the Hough transformation.

Referring to Figs. 4 and 5, the Hough transformation
will now be generally described. Fig. 4a shows five
points in a plane with the coordinate axes X and Y. Fig.
4b shows the Hough transform of the five points in Fig.
4a. A first point 18a has a first Hough transform curve
19a which describes all the straight lines extending
through the first point 18a in Fig. 4a as a function of
the parameters θ and ρ, where θ is the angle of the
straight lines through the point and ρ is the distance of
the straight lines from origin. The Hough transform
curves have a sinusoidal shape. Correspondingly, the
second 18b, third 18c, fourth 18d and fifth 18e points
have a second 19b, third 19c, fourth 19d and fifth 19e
Hough transform curve. The second 19b, third 19c and
fourth 19d Hough transform curves in Fig. 4b intersect in
a point. This point in Fig. 4b corresponds to a straight
line in Fig. 4a extending through the second point 18b,
the third point 18c and the fourth point 18d.
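
For reference, the sinusoidal shape of the curves can be
written out under the commonly used normal parametrization
of a straight line, in which θ is taken as the angle of the
normal to the line. This choice is an assumption made here
for illustration; the description does not state which
parametrization is used. For a point (x_0, y_0):

```latex
\rho(\theta) = x_0 \cos\theta + y_0 \sin\theta
             = \sqrt{x_0^2 + y_0^2}\,\cos(\theta - \varphi),
\qquad \varphi = \operatorname{atan2}(y_0, x_0)
```

Two such curves intersect at the pair (θ, ρ) that satisfies
the equation for both points, i.e. at the parameters of the
straight line through both points.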

Fig. 9 shows the second recorded image 15 from
Fig. 3, which consists of a plurality of pixels in which
lines of text are to be found. For each black pixel in
the second recorded image 15, a Hough transform curve is
calculated. The Hough transforms are inserted in one and
the same diagram.

Fig. 9 shows a first line 32 and a second line 33.
Lines having approximately the same direction as the
first line 32 will intersect a larger number of points
than lines having approximately the same inclination as
the second line 33 since the dark pixels in the recorded
image 15 are positioned along lines of text having
approximately the same direction as the first line 32.
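
The counting of intersections may, purely as an
illustration, be sketched with a simple vote accumulator.
The sketch assumes the normal parametrization
ρ = x cos θ + y sin θ, a resolution of one degree and one
pixel, and dark pixels given as (x, y) coordinates; the
function name hough_peak and its parameters are
hypothetical.

```python
import math
from collections import Counter


def hough_peak(points, n_theta=180, rho_step=1.0):
    """Return (theta_deg, rho, votes) of the most-voted line.

    Each dark pixel votes once per sampled angle for the
    (theta, rho) cell its Hough transform curve passes through.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    (t, r), count = max(votes.items(), key=lambda kv: kv[1])
    return 180.0 * t / n_theta, r * rho_step, count


# Dark pixels along a horizontal line of text at y = 5.
pixels = [(x, 5) for x in range(0, 200, 2)]
print(hough_peak(pixels))
```

With this parametrization a horizontal line of text gives
its maximum at θ = 90 degrees, since θ is here the angle of
the normal to the line.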

Fig. 5 illustrates that the maximum number of
intersections can be used to determine the orientation
of the lines of text. The circles 73 correspond to points
in the Hough transform diagram where several Hough
transform curves of points in Fig. 9 intersect, i.e.
maxima of intersections. The circles 73 correspond to
the image 15 and are positioned along a straight line 34.
From the distance between the circles 73, the distance
between the lines of text can be determined. The position
of the intersecting line 34 along the θ axis indicates
the rotation of the recorded image. The second line 35 in
Fig. 5 corresponds to the image 14 and extends through a
plurality of maxima indicated by crosses 36 in the
diagram. The inclination of the second line 35 indicates
that the image has a perspective, i.e. the lines of text
have different rotations. Also the slightly different
distances between the crosses 36 indicate that the image
has a perspective with larger distances between the lines
at the lower portion of the line 36. The displacement of
the second intersecting line 35 in relation to the first
line along the θ axis indicates that the lines of text
are rotated in the recorded image. By means of this
information, the image may be adjusted for perspective
and rotated, for example to the horizontal direction,
which is the same direction as image 15. As appears,
image 15 corresponding to line 34 needs no adjustment,
while image 14 corresponding to line 35 needs adjustment
of the perspective to make line 36 vertical and with
approximately equidistant crosses 36 and adjustment as to
the rotational position, to move line 35 to the same
angle θ as image 15, which may correspond to zero angle.

After identification of the lines of text, an
identification of the start and end parts of the words in
the recorded image is carried out. Fig. 6 indicates how the
letter "e" 37 is detected. As shown in Fig. 6, the number
of dark pixels 7 is counted in the vertical direction
perpendicular to the line 38, which may be calculated as
described above. The number of dark pixels is zero up to
the start 39 of the letter "e" and will again be zero at
the end 40 of the letter "e". When the number of dark
pixels has been zero for a predetermined period, this is
detected as the end of a word.
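
A minimal sketch of this detection is given below, assuming
a binary image of one already horizontal line of text in
which "#" denotes a dark pixel, and taking the predetermined
period to be a number of consecutive columns without dark
pixels. The names find_words and gap are hypothetical.

```python
def find_words(rows, gap=3):
    """Return (start, end) column spans of words, end exclusive.

    rows: equal-length strings, '#' marking a dark pixel.
    A word ends when `gap` consecutive columns hold no dark pixel.
    """
    width = len(rows[0])
    # Number of dark pixels counted vertically in each column.
    dark = [sum(row[x] == "#" for row in rows) for x in range(width)]
    words, start, blank_run = [], None, 0
    for x, count in enumerate(dark):
        if count > 0:
            if start is None:
                start = x
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= gap:
                words.append((start, x - blank_run + 1))
                start, blank_run = None, 0
    if start is not None:          # word running to the image edge
        words.append((start, width - blank_run))
    return words


image = ["##.#....###",
         "#..#....#.#"]
print(find_words(image))
```

The single blank column inside the first word is shorter
than the gap and is therefore treated as a space between
letters rather than between words.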

With reference to Fig. 7, the words are indicated as
rectangles, the start 41 of the rectangles indicating the
start of a word and the end 42 of the rectangles
indicating the end of a word. Fig. 7a corresponds to a
first recorded image 14 and Fig. 7b corresponds to a
second recorded image 15. A length of a first word 43
in Fig. 7a has correspondence in a length of a second
word 44 in Fig. 7b. Correspondingly, a length of a third
word 45 in Fig. 7a has correspondence in a length of a
fourth word 46 in Fig. 7b. By matching the two images, it
is possible to find out how the recorded images overlap
each other, by only using the graphical information of
the length of each word.

Thus, by using the length of the words for each
line, it is possible to carry out a rough putting-together
or stitching or assembling of the two images.
The images are roughly assembled so that a sequence of
word lengths in the first recorded image corresponds to a
sequence of word lengths in the second recorded image.
The word lengths along different lines in the first
recorded image should thus correspond to word lengths
along corresponding lines in the second recorded image.
In this way, it is possible to determine how the images
are displaced relative to each other.
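
The rough assembling may be illustrated by the following
sketch for a single line of text, assuming that each image
has been reduced to a list of word lengths in pixels and
that two lengths correspond if they differ by no more than a
small tolerance. The names match_word_lengths, tol and
min_overlap are hypothetical.

```python
def match_word_lengths(a, b, tol=2, min_overlap=2):
    """Return (shift, matches) so that b[j] corresponds to a[j + shift].

    Only positions where every overlapping pair of word lengths
    agrees within `tol` pixels are accepted; the longest such
    overlap is returned, or (None, 0) if there is none.
    """
    best = (None, 0)
    for shift in range(-len(b) + 1, len(a)):
        lo, hi = max(0, shift), min(len(a), shift + len(b))
        pairs = [(a[i], b[i - shift]) for i in range(lo, hi)]
        if len(pairs) < min_overlap:
            continue
        if all(abs(p - q) <= tol for p, q in pairs) and len(pairs) > best[1]:
            best = (shift, len(pairs))
    return best


first_image = [40, 25, 60, 33, 18]
second_image = [61, 32, 18, 70, 22]
print(match_word_lengths(first_image, second_image))
```

Here the first word of the second image corresponds to the
third word of the first image, so the images are displaced
by two words. The same comparison can be repeated line by
line to obtain the displacement in the vertical direction.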

Fig. 3 shows how this may be used to sort out
unnecessary images so that optical character recognition
need not be carried out on all images that are
recorded. A first image 47 and a second image 49 together
completely cover the area which is covered by a third
recorded image 48, which is indicated by dashed lines. By
using the method as described in connection with Fig. 7,
the third recorded image 48 can be completely omitted,
without optical character recognition being carried out.
Of course, Fig. 3 is only schematic, since the images
often overlap each other to a much larger extent. Indeed,
if the scanner pen is held approximately still, all
images overlap more or less, and a substantial saving of
computing power may be obtained by omitting images with
duplicate information. On the other hand, during normal
scanning operation, a partial overlap may be used for
sorting out errors in the optical character recognition
process or the assembling process, in which case
overlapping images are not omitted or discarded.
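
The sorting out of unnecessary images may be sketched as
follows, under the simplifying assumption that the rough
assembling has placed each image as an interval (start, end)
along one direction only. The function name redundant is
hypothetical.

```python
def redundant(spans):
    """Return indices of images fully covered by the remaining images.

    spans: list of (start, end) intervals, one per recorded image.
    An image already marked for omission is not used to cover others.
    """
    drop = []
    for i, (start, end) in enumerate(spans):
        others = sorted(span for j, span in enumerate(spans)
                        if j != i and j not in drop)
        reach = start
        for o_start, o_end in others:
            if o_start > reach:    # uncovered gap before this image
                break
            reach = max(reach, o_end)
        if reach >= end:
            drop.append(i)
    return drop


# Three images in recording order; the middle one lies
# entirely within the area covered by the other two.
print(redundant([(0, 10), (4, 14), (8, 20)]))
```

Only the images that are not returned then need to be
converted into sets of characters.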

Another method of assembling the images is disclosed
in Fig. 8, which shows a first set of characters 50 with
a plurality of characters 60 corresponding to the first
recorded image 47 in Fig. 3 and a second set of
characters 51 with a plurality of characters 61
corresponding to the second recorded image 49 in Fig. 3.
The text in the first set of characters and the second set
of characters follows the direction of the lines of text
62, which may have been obtained by the Hough transform
process described above. The first set of characters 50
and the second set of characters 51 are put together or
assembled by comparing characters in the two sets of
characters. Firstly, the first character 52 in the first
set of characters is compared with each of the characters
in the second set of characters 51. The operation
proceeds correspondingly for the second character 63 and
the third character 64 in the first set of characters 50.
Good correspondence is obtained when the characters in
the word "skilled" 53 in the second line of the first set
of characters are compared with the word "skilled" 54 in
the first line in the second set of characters 51. Since
a word can appear in many positions in a text, one starts 
from the first correspondence found and then compares the 
rest of the text for this position, a total number of 
points being obtained which indicates how well the two 
sets of characters correspond for this position. 
Subsequently this step is repeated for the next position 
where correspondence is obtained. Finally, the position 
is selected in which the total number of points indicates 
the best correspondence. In Fig. 8, the text is in 
English but a person skilled in the art understands that 
the text could just as well be in any other language and 
that the text in Fig. 8 is only used to illustrate the 
function of the device. The image may also comprise 
symbols, such as mathematical expressions, or even line 
art, to a limited extent. 
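
By way of a non-limiting sketch, the comparison of two sets
of characters comprising several lines may be carried out
over both a line displacement and a character displacement.
The sketch assumes that each set of characters is a list of
strings, one per line of text, with characters placed at
fixed-width column positions, and awards one point per
corresponding non-blank character and deducts one point per
differing character. The names align_2d and min_points, and
the example text, are hypothetical.

```python
def align_2d(first, second, min_points=5):
    """Return (row_shift, col_shift, points) of the best position.

    second[r][c] is compared with first[r + row_shift][c + col_shift].
    Blanks in `second` are ignored so that empty areas do not score.
    """
    best = (None, None, 0)
    max_cols = max(len(line) for line in first + second)
    for dr in range(-len(second) + 1, len(first)):
        for dc in range(-max_cols + 1, max_cols):
            points = 0
            for r, line in enumerate(second):
                if not 0 <= r + dr < len(first):
                    continue
                target = first[r + dr]
                for c, ch in enumerate(line):
                    if ch != " " and 0 <= c + dc < len(target):
                        points += 1 if target[c + dc] == ch else -1
            if points > best[2]:
                best = (dr, dc, points)
    return best if best[2] >= min_points else (None, None, 0)


page = ["the device may record text ",
        "a person skilled in the art",
        "understands that the lines "]
first = [line[:16] for line in page]       # left part, all lines
second = [line[9:] for line in page[1:]]   # right part, lines 2 and 3
print(align_2d(first, second))
```

In the example the first line of the second set corresponds
to the second line of the first set, displaced nine
characters to the right, which mirrors the situation
described for the word "skilled" in Fig. 8.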

The two assembling methods may be combined, so that
the images are first compared in a rough manner and then
on character level. In this way, it will be possible to
carry out assembling in two dimensions with limited
computing power. If the reading device is first moved to
the right as in Fig. 3 and then down and to the left, a
larger surface is obtained which is to be assembled in
two dimensions. By determining the mutual relationship of
the partial images by rough assembling and then on
character level, it is relatively easy to obtain both
horizontal and vertical assembling. The characteristic
that the lines determine the vertical position with great
accuracy results in the possibility of alignment.

Each assembling method may alternatively be used 
separately. If the method using length of words is used 
separately, the images may be assembled into a composite
image before final conversion of the images to characters
by optical character recognition. This method may be used
when the images comprise matter other than text, such as
small pictures or symbols. In this case, the symbols or
small pictures will be handled as if they were words and
be enclosed in a rectangle. The assembling process will
work, provided that the pictures or symbols are
sufficiently small. However, the OCR process will fail to
recognize the rectangle as characters, and then the pixel
representation may be used instead. Other types of
borders than rectangles may be used, such as an area
delimited by more than four straight lines or even curved
lines.

If the assembling is carried out after the
conversion of each image to character code such as ASCII,
small pictures and symbols may be handled separately, at
least if they are surrounded by text matter in each
image.

As appears from Fig. 7, the images may be processed
by division of the image into borders, such as rectangles.
Each rectangle comprises a complete word or a symbol or a
picture surrounded by white areas. Several pictures are
compared for finding a succession of rectangles which
corresponds to the wanted text. When overlapping
positions have been determined, each rectangle is given a
succession number, comprising the line number and the
word number on that line. Thus, in Figs. 7a and 7b, the
word 41 obtains designation 1:1, the next word 1:2, and
the following word 1:3. On the second line, the first
word obtains the designation 2:1, the second word 2:2,
the third word 2:3 and the fourth word 2:4.
Correspondingly, on the third line, the first word
obtains designation 3:1, the second word 3:2, the third
word 3:3 and the fourth word 3:4. It is now recognized
that the fourth word on the second line, 2:4, corresponds
to the second word on line 2 of Fig. 7b, which means that
the first word on the second line of Fig. 7b obtains
designation 2:3, the second word 2:4 (corresponding to
the fourth word of line 2 of Fig. 7a), the third word 2:5
and the fourth word 2:6. The same goes for the third line
of Fig. 7b, in which the first word obtains designation
3:3, the second word 3:4, the third word 3:5 and the
fourth word 3:6. The words are then arranged in a row of
words forming a complete line.
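The numbering of the rectangles can be illustrated with a short sketch. It is only an illustration under stated assumptions: the rectangles are represented by placeholder labels rather than pixel data, the same word offset is assumed to apply on every line, and the first line of the second image is invented for the example. The names `designate` and `word_offset_from_match` are hypothetical.

```python
def designate(rows, word_offset=0):
    """Give every rectangle a 1-based (line, word) succession number.

    `rows` is a list of text lines, each a list of rectangle labels.
    `word_offset` shifts the word numbers (assumed equal on all lines).
    """
    return {
        (line, word + word_offset): label
        for line, row in enumerate(rows, start=1)
        for word, label in enumerate(row, start=1)
    }


def word_offset_from_match(word_in_first, word_in_second):
    """Offset that maps word numbers of the second image onto the first."""
    return word_in_first - word_in_second


# Placeholder labels stand in for the pixel rectangles of Figs. 7a and 7b.
fig_7a = [["a1", "a2", "a3"], ["b1", "b2", "b3", "b4"], ["c1", "c2", "c3", "c4"]]
fig_7b = [["a3", "a4", "a5"], ["b3", "b4", "b5", "b6"], ["c3", "c4", "c5", "c6"]]

# Word 2:4 of the first image matches the second word on line 2 of the second.
offset = word_offset_from_match(4, 2)
first = designate(fig_7a)
second = designate(fig_7b, word_offset=offset)
```

Here the words on the second line of the second image receive the designations 2:3 to 2:6, and those on the third line 3:3 to 3:6, as in the description.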

It can now be seen that several words are duplicated
in the two pictures, namely word 2:4 and word 3:4. There
are further words which are duplicated on the other
lines. These duplications may be omitted and replaced by
a single word. Alternatively, the duplications may be kept
and used for increasing the OCR interpretation accuracy.

Finally, the words are OCR processed in the right
order to obtain the desired text.

If any rectangle is larger in the vertical direction
than a single line, it may obtain a designation such as
2,3:6, if it occupies lines two and three. In this way, larger
objects such as pictures or symbols may be handled.

There are a number of cases in which words are only
partially included in Fig. 7a but included in full in Fig. 7b,
such as word 1:3. In this case, the longest version of
the word is used for interpretation. If there is any doubt
whether this is correct, all fragments may be used to recreate
the complete word.
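The handling of duplicated and partial words can be sketched as below. The sketch rests on assumptions: each image is given as a mapping from a (line, word) designation to a fragment, strings stand in for the pixel rectangles, and "longest" is taken to mean the longest string. The name `merge_designations` is hypothetical.

```python
from collections import defaultdict


def merge_designations(*images):
    """Merge several {(line, word): fragment} mappings into one word list.

    All versions of a word are collected; the longest version is kept,
    and the words are returned in line and word order.
    """
    versions = defaultdict(list)
    for image in images:
        for key, fragment in image.items():
            versions[key].append(fragment)
    merged = {key: max(frags, key=len) for key, frags in versions.items()}
    return [merged[key] for key in sorted(merged)]


# Word 1:3 is cut off in the first image and complete in the second.
first_image = {(1, 1): "The", (1, 2): "quick", (1, 3): "bro"}
second_image = {(1, 3): "brown", (1, 4): "fox"}
words = merge_designations(first_image, second_image)
```

Keeping the full list of versions per designation, instead of discarding them, would also allow the duplicates to be used for checking the OCR result, which is the alternative the description mentions.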

In this way, the images are assembled on a word
basis starting from the pixel representation and dividing
the image into borders, such as rectangles, which are
compared for the best overlapping position. Then,
duplicate information is omitted or used for further
accuracy, and then the words are arranged in the right
order and finally converted to ASCII code.

Pictures or symbols which cannot be recognised by
the OCR program may be maintained in pixel format and
displayed as such in the final picture. For increased
safety, at least one version of the words may also be
kept in pixel format in parallel with the OCR version,
especially if the OCR program indicates a poor quality of
processing.

Every new image is compared with previous
information in order to find out its orientation therein.
Thus, each image is processed both in the vertical
direction and in the horizontal direction, both
forwards and backwards. Thus, it is possible to scan in
two dimensions by the present invention. This is possible
because of the division of the image into a coded
representation which is less time consuming to process,
either in the form of rectangles or similar or in the
form of ASCII code.

It may be of interest to have some kind of feedback
that the desired information has been gathered. This may be
accomplished by displaying the assembled information on
the screen. However, since the display on a handheld
device is rather small, another method would be to
display the lines as a succession of pixels on the
display, in which one pixel approximately corresponds to
a single character. Then, the characters will form words
and the layout of words will give a visual indication of
the scanned surface. If the processing comprises division
into rectangles, these rectangles may be indicated as
they are assembled.

Figs. 11a, 11b and 11c show how this may be
accomplished. Fig. 11a is a text that is to be scanned.
Fig. 11b is the division of this text into rectangles.
Finally, Fig. 11c is the representation of the rectangles
on a small display, in which rectangles are indicated by
black pixels and spaces between the rectangles by grey
pixels. From Fig. 11c it can be seen that some
information is missing, as indicated by white pixels 29.
The user then directs the scanner pen towards the missing
area until the display indicates that all areas are
fully covered by at least one image. Finally, the complete
image is converted to ASCII, if that has not been done
earlier in the process.



If the assembling is done by using the coded
representation in the form of ASCII code, each decoded
character is displayed as a black dot on the display
screen, while spaces are displayed as grey dots. Any
white dot will indicate that information is missing, as
described above.
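The feedback display for the character-coded case can be sketched as follows. This is a minimal sketch under assumptions: the assembled text is held as rows of cells, a cell that has not yet been scanned is `None`, and the three pixel values are rendered here as the text symbols `#`, `+` and `.` instead of real display pixels. The names `coverage_map` and `fully_covered` are hypothetical.

```python
BLACK, GREY, WHITE = "#", "+", "."  # stand-ins for the three pixel values


def coverage_map(rows):
    """Render one symbol per cell: black for a decoded character,
    grey for a space, white for a cell not yet covered by any image."""
    rendered = []
    for row in rows:
        rendered.append("".join(
            WHITE if cell is None else GREY if cell == " " else BLACK
            for cell in row
        ))
    return rendered


def fully_covered(rows):
    """True when no cell is missing, i.e. no white dot would be shown."""
    return all(cell is not None for row in rows for cell in row)


# The second row has a two-character gap that still needs to be scanned.
assembled = [list("ab c"), ["d", None, None, "e"]]
display = coverage_map(assembled)
```

The user would continue scanning until `fully_covered` holds, which corresponds to the display showing no white dots.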

According to the present invention, it is required 
that the information at least partially is positioned 
along identifiable lines. If the device is passed across 
a photograph or some other surface which is not divided 
into lines, this can easily be recorded by the processor 
in the device, and this part of the image may be 
discarded or stored separately as a picture or
photograph. If said surface is completely or at least
partially surrounded by lines, it would be possible to
handle the situation via the invention, provided that at
least a portion of a line is included in every image.

Fig. 10 is a flow chart of the operation of a computer
program according to the invention. The computer
program is adapted to be executed in the electronic circuitry
part 4 of the device. In a first step 55, digital
images are received from the sensor 8. In a second step
56, the digital images are converted into strings of
characters using character recognition. In a third step
57, the strings of characters are assembled or put
together.
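The three steps of Fig. 10 can be outlined in code. The outline is a sketch under assumptions and not the program of the invention: the character recognition is passed in as a caller-supplied function `recognize`, the assembly is reduced to one dimension by joining consecutive strings on their longest common overlap, and strings without sufficient overlap are simply joined with a space. All names are hypothetical.

```python
def merge_strings(left, right, min_overlap=3):
    """Join two strings where the end of `left` repeats the start of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    # Assumption: with no sufficient overlap the strings are kept side by side.
    return left + " " + right


def run_pipeline(images, recognize):
    """Step 55: receive images. Step 56: convert each image to a string
    with `recognize`. Step 57: assemble the strings into one text."""
    strings = [recognize(image) for image in images]
    if not strings:
        return ""
    text = strings[0]
    for nxt in strings[1:]:
        text = merge_strings(text, nxt)
    return text


# Strings stand in for the recorded images; the identity function
# stands in for the character recognition step.
images = ["a person ski", "n skilled in", "ed in the art"]
text = run_pipeline(images, recognize=lambda image: image)
```

In a real device `recognize` would be an OCR routine operating on pixel data, and the assembly would also have to handle vertical offsets as described for Fig. 8.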

It is not necessary to carry out the rough putting-together
of the recorded images as described in
connection with Fig. 7, and optical character recognition
can be carried out directly on the recorded
images. Moreover, the orientation of the lines of text
need not be identified in the recorded images if an
optical character recognition algorithm is used which is
able to identify characters also when the lines of text
are rotated.

A person skilled in the art realizes that the invention
is not limited to the embodiments shown and that
many modifications are feasible within the scope of the
invention. The invention is only limited by the appended
patent claims.



